In [62]:
import pandas as pd 
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
plt.style.use('ggplot')

Data Visualization Worksheet

This worksheet will walk you through the basic process of preparing a visualization using Python/Pandas/Matplotlib.

For this exercise, we will create a line plot comparing the number of hosts infected by the Bedep and ConfickerAB bot families in the Government/Politics sector.

Prepare the data

The data we will be using is in the dailybots.csv file, which can be found in the data folder. As is common, we will have to do some data wrangling to get it into a format we can visualize. To do that, we'll need to:

  1. Read in the data
  2. Filter the data by industry and botnet

The result should look something like this:

         date  ConfickerAB  Bedep
0  2016-06-01          255    430
1  2016-06-02          431    453

The approach I chose in the answer notebook might be a little more complex than necessary, but I wanted you to see all the steps involved.
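
If you get stuck, here is one possible way to carry out the two steps above. It is only a sketch: because dailybots.csv is not reproduced in this worksheet, it builds a tiny stand-in frame instead of reading the file. The column names (date, botfam, industry, hosts), the file path in the comment, and every row other than the two sample rows shown above are assumptions, so check them against the real file before relying on this.

```python
import pandas as pd

# Stand-in for the real file. In the worksheet you would instead run something like
#   raw = pd.read_csv("data/dailybots.csv")   # path is an assumption
# The column names below are assumed, not confirmed by the worksheet.
raw = pd.DataFrame({
    "date": ["2016-06-01", "2016-06-01", "2016-06-01",
             "2016-06-02", "2016-06-02", "2016-06-02"],
    "botfam": ["ConfickerAB", "Bedep", "Zeus",
               "ConfickerAB", "Bedep", "ConfickerAB"],
    "industry": ["Government/Politics"] * 5 + ["Retail"],
    "hosts": [255, 430, 12, 431, 453, 99],
})

# Step 2: keep only the sector and the two bot families we care about
mask = (raw["industry"] == "Government/Politics") & raw["botfam"].isin(["ConfickerAB", "Bedep"])
filtered = raw[mask]

# Reshape from one row per (date, botfam) to one column per bot family
wide = (
    filtered.pivot_table(index="date", columns="botfam", values="hosts", aggfunc="sum")
    .reset_index()[["date", "ConfickerAB", "Bedep"]]
)
wide.columns.name = None  # drop the leftover "botfam" label on the column index

print(wide)
```

The pivot is what turns the long, one-row-per-observation layout into the wide layout shown above, which is the shape .plot() expects when each column should become its own line.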


In [ ]:

Create the first chart

Using the .plot() method, plot your dataframe and see what you get.
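
As a rough illustration of what to expect, the sketch below calls .plot() on a two-row stand-in frame that has the same shape as the sample in "Prepare the data". The frame name wide is hypothetical; use whatever name you gave your own dataframe.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Two-row stand-in with the shape shown in "Prepare the data"
wide = pd.DataFrame({
    "date": ["2016-06-01", "2016-06-02"],
    "ConfickerAB": [255, 431],
    "Bedep": [430, 453],
})

# With no arguments, .plot() draws one line per numeric column
# and uses the row index (0, 1, ...) for the x-axis.
ax = wide.plot()
```

Note that the date column is still text at this point, so it is not plotted and the x-axis shows row numbers rather than dates.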


In [ ]:

Customizing your plot:

The default plot doesn't look horrible, but there are certainly some improvements which can be made. Try the following:

  1. Change the x-axis to a date by converting the date column to a date object.
  2. Move the legend to the upper center of the graph.
  3. Make the figure size larger.
  4. Instead of rendering both lines on one graph, split them up into two plots.
  5. Add axis labels.

There are many examples in the pandas visualization documentation: http://pandas.pydata.org/pandas-docs/version/0.18.1/visualization.html

A few hints:

  http://stackoverflow.com/questions/4700614/how-to-put-the-legend-out-of-the-plot
  http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.plot.html
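
The five suggestions above could be combined roughly as follows. This is a sketch rather than the answer notebook's solution: the frame name wide, the figure sizes, and the label text are all illustrative choices, and the data is the same two-row stand-in used in the sample output.

```python
import pandas as pd
import matplotlib.pyplot as plt

wide = pd.DataFrame({
    "date": ["2016-06-01", "2016-06-02"],
    "ConfickerAB": [255, 431],
    "Bedep": [430, 453],
})

# 1. Convert the date column from text to datetime objects
wide["date"] = pd.to_datetime(wide["date"])

# 3. figsize is given in inches as (width, height)
ax = wide.plot(x="date", figsize=(12, 6))

# 2. Move the legend to the upper center of the axes
ax.legend(loc="upper center")

# 5. Axis labels
ax.set_xlabel("Date")
ax.set_ylabel("Infected hosts")

# 4. subplots=True draws each column in its own panel and returns an array of axes
axes = wide.plot(x="date", subplots=True, figsize=(12, 8))
for panel in axes:
    panel.set_ylabel("Infected hosts")
    panel.legend(loc="upper center")
axes[-1].set_xlabel("Date")
```

Because .plot() returns ordinary Matplotlib axes, anything in the Matplotlib documentation for axes (titles, tick formatting, legend placement) can be applied to the result.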


In [ ]:

Making it Interactive

Using Bokeh, create an interactive chart of the same data.
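
One possible starting point is sketched below. It uses bokeh.plotting rather than bokeh.charts, because bokeh.charts was removed from later Bokeh releases; treat that substitution, the frame name wide, the colors, and the title text as assumptions rather than the intended answer. The legend_label argument needs a reasonably recent Bokeh version.

```python
import pandas as pd
from bokeh.plotting import figure

wide = pd.DataFrame({
    "date": ["2016-06-01", "2016-06-02"],
    "ConfickerAB": [255, 431],
    "Bedep": [430, 453],
})
wide["date"] = pd.to_datetime(wide["date"])

# x_axis_type="datetime" formats the x-axis ticks as dates;
# the tools string enables panning, zooming, and hover tooltips.
p = figure(
    x_axis_type="datetime",
    title="Infected hosts, Government/Politics sector",
    tools="pan,wheel_zoom,box_zoom,reset,hover",
)

# One line glyph per bot family
p.line(wide["date"], wide["ConfickerAB"], legend_label="ConfickerAB",
       line_width=2, color="firebrick")
p.line(wide["date"], wide["Bedep"], legend_label="Bedep",
       line_width=2, color="navy")

p.xaxis.axis_label = "Date"
p.yaxis.axis_label = "Infected hosts"
p.legend.location = "top_center"

# In the notebook, after output_notebook() has run, display the chart with:
#   from bokeh.io import show
#   show(p)
```

Unlike the Matplotlib output, the rendered chart can be panned and zoomed in the browser, which is the main reason to reach for Bokeh here.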


In [58]:
from bokeh.plotting import output_notebook
output_notebook()


Loading BokehJS ...

In [59]:
from bokeh.charts import ... #Your code here..
from bokeh.io import show

In [ ]: